We propose a lightweight and highly efficient joint detection and tracking pipeline for multi-object tracking using a fully transformer-based architecture. It is a modified version of TransTrack that overcomes the computational bottleneck associated with its design while achieving a state-of-the-art MOTA score of 73.20%. The model is driven by a transformer-based backbone instead of a CNN, which scales well with the input resolution. We also propose a drop-in replacement for the feed-forward network of the transformer encoder layer that uses the Butterfly Transform operation to perform channel fusion and a depth-wise convolution to learn the spatial context within the feature maps that is otherwise missing from the attention maps of the transformer. As a result of our modifications, we reduce the overall model size of TransTrack by 58.73% and its complexity by 78.72%. We therefore expect our design to provide new perspectives for architecture optimization in future research on multi-object tracking.
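To make the mechanism concrete, the sketch below (our own illustration in PyTorch, not the authors' code) shows how a butterfly transform over channels plus a depth-wise convolution could stand in for the encoder's feed-forward network; the module names, the random initialization, and the residual placement are assumptions.

    import math
    import torch
    import torch.nn as nn

    class ButterflyChannelFusion(nn.Module):
        """Butterfly transform over the channel dimension: log2(C) stages of
        learned 2x2 mixing between channel pairs, costing O(C log C) instead of
        the O(C^2) of a dense pointwise layer (illustrative sketch)."""

        def __init__(self, channels: int):
            super().__init__()
            assert (channels & (channels - 1)) == 0, "channels must be a power of two"
            self.num_stages = int(math.log2(channels))
            # one learned 2x2 mixing matrix per channel pair, per stage
            self.weights = nn.Parameter(0.1 * torch.randn(self.num_stages, channels // 2, 2, 2))

        def forward(self, x: torch.Tensor) -> torch.Tensor:    # x: (B, C, H, W)
            b, c, h, w = x.shape
            x = x.permute(0, 2, 3, 1)                           # (B, H, W, C)
            for s in range(self.num_stages):
                stride = 2 ** s
                blocks = c // (2 * stride)
                xs = x.reshape(b, h, w, blocks, 2, stride)       # pair channels at distance `stride`
                ws = self.weights[s].reshape(blocks, stride, 2, 2)
                xs = torch.einsum("bhwnqk,nkpq->bhwnpk", xs, ws)  # 2x2 mix per channel pair
                x = xs.reshape(b, h, w, c)
            return x.permute(0, 3, 1, 2)                         # back to (B, C, H, W)

    class ButterflyFFN(nn.Module):
        """Drop-in replacement for the encoder FFN: butterfly channel fusion plus
        a 3x3 depth-wise convolution that injects the spatial context the
        attention maps alone do not provide."""

        def __init__(self, channels: int):
            super().__init__()
            self.fuse = ButterflyChannelFusion(channels)
            self.dwconv = nn.Conv2d(channels, channels, kernel_size=3,
                                    padding=1, groups=channels)  # depth-wise
            self.act = nn.GELU()

        def forward(self, x: torch.Tensor) -> torch.Tensor:      # x: (B, C, H, W)
            return x + self.dwconv(self.act(self.fuse(x)))        # residual, as in a standard FFN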
In this paper we take the first steps in studying a new approach to the synthesis of efficient communication schemes in multi-agent systems trained via reinforcement learning. We combine symbolic methods with machine learning in what is referred to as a neuro-symbolic system. The agents are not restricted to using only the initial primitives: reinforcement learning is interleaved with steps that extend the current language with novel higher-level concepts, allowing generalisation and more informative communication via shorter messages. We demonstrate that this approach allows agents to converge more quickly on a small collaborative construction task.
In order for artificial neural networks to begin accurately mimicking biological ones, they must be able to adapt to new exigencies without forgetting what they have learned from previous training. Lifelong learning approaches to artificial neural networks strive towards this goal, yet have not progressed far enough to be realistically deployed for natural language processing tasks. The proverbial roadblock of catastrophic forgetting still keeps researchers from an adequate lifelong learning model. While efforts are being made to quell catastrophic forgetting, there is a lack of research into the importance of class ordering when training on new classes for incremental learning. This is surprising, as the ordering of "classes" that humans learn is heavily monitored and incredibly important. While heuristics for developing an ideal class order have been researched, this paper examines class ordering as it relates to priming as a scheme for incremental class learning. By examining the connections between various methods of priming found in humans and how those are mimicked yet remain unexplained in lifelong machine learning, this paper provides a better understanding of the similarities between biological and synthetic systems while improving current practices to combat catastrophic forgetting. By merging psychological priming practices with class ordering, this paper identifies a generalizable method for class ordering in NLP incremental learning tasks that consistently outperforms random class ordering.
Low Earth Orbit (LEO) constellations, each comprising a large number of satellites, have become a new source of big data "from the sky". Downloading such data to a ground station (GS) for big data analytics demands very high bandwidth and involves large propagation delays. Federated Learning (FL) offers a promising solution because it allows data to stay in situ (never leaving the satellites) and only requires transmitting the machine learning model parameters trained on the satellites' data. However, the conventional, synchronous FL process can take several days to train a single FL model in the context of satellite communication (Satcom), due to a bottleneck caused by straggler satellites. In this paper, we propose an asynchronous FL framework for LEO constellations called AsyncFLEO to improve FL efficiency in Satcom. Not only does AsyncFLEO address the bottleneck (idle waiting) in synchronous FL, it also solves the issue of model staleness caused by straggler satellites. AsyncFLEO utilizes high-altitude platforms (HAPs) positioned "in the sky" as parameter servers and consists of three technical components: (1) a ring-of-stars communication topology, (2) a model propagation algorithm, and (3) a model aggregation algorithm with satellite grouping and staleness discounting. Our extensive evaluation with both IID and non-IID data shows that AsyncFLEO outperforms the state of the art by a large margin, cutting convergence delay by a factor of 22 and increasing accuracy by 40%.
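The staleness-discounting idea can be illustrated with a short sketch; the exponential discount, the blending factor, and the function name below are our assumptions rather than AsyncFLEO's exact aggregation rule.

    import numpy as np

    def staleness_discounted_aggregate(global_params, updates, current_round,
                                       alpha=0.5, mix=0.5):
        """Illustrative asynchronous aggregation with staleness discounting:
        each satellite's update is weighted by its sample count and exponentially
        down-weighted by how many rounds old it is, so stragglers still contribute
        without stalling training. `updates` is a list of
        (params, num_samples, round_trained) tuples."""
        total_weight = 0.0
        mixed = np.zeros_like(global_params)
        for params, num_samples, round_trained in updates:
            staleness = max(current_round - round_trained, 0)
            weight = num_samples * (alpha ** staleness)   # assumed exponential discount
            mixed += weight * params
            total_weight += weight
        if total_weight == 0.0:
            return global_params
        mixed /= total_weight
        # blend with the current global model so a few very stale updates cannot dominate
        return (1.0 - mix) * global_params + mix * mixed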
Reinforcement Learning (RL) algorithms are known to scale poorly to environments with many available actions, requiring numerous samples to learn an optimal policy. The traditional approach of using the same fixed action space in every state implies that the agent, while learning to maximize its reward, must also learn to ignore irrelevant actions such as inapplicable actions (i.e., actions that have no effect on the environment when performed in a given state). Knowing this information can reduce the sample complexity of RL algorithms by masking inapplicable actions out of the policy distribution, so that only actions relevant to finding an optimal policy are explored. This is typically done in an ad-hoc manner with hand-crafted domain logic added to the RL algorithm. In this paper, we propose a more systematic approach to introducing this knowledge into the algorithm. We (i) standardize the way knowledge can be manually specified to the agent; and (ii) present a new framework to autonomously learn these state-dependent action constraints jointly with the policy. We show experimentally that learning inapplicable actions greatly improves the sample efficiency of the algorithm by providing a reliable signal to mask out irrelevant actions. Moreover, we demonstrate that, thanks to the transferability of the acquired knowledge, it can be reused in other tasks to make the learning process more efficient.
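A minimal sketch of the masking step described above, assuming a discrete action space and a boolean applicability mask; it shows the general technique, not the paper's specific framework.

    import torch
    import torch.nn.functional as F

    def masked_policy(logits: torch.Tensor, applicable: torch.Tensor) -> torch.Tensor:
        """Inapplicable-action masking: logits of actions flagged as inapplicable
        in the current state are set to -inf before the softmax, so the policy
        never samples them (and receives no gradient through them)."""
        masked_logits = logits.masked_fill(~applicable, float("-inf"))
        return F.softmax(masked_logits, dim=-1)

    # usage: 5 actions, actions 1 and 3 have no effect in this state
    logits = torch.randn(5)
    applicable = torch.tensor([True, False, True, False, True])
    probs = masked_policy(logits, applicable)   # zero probability on masked actions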
Point cloud completion, as an upstream procedure for 3D recognition and segmentation, has become an essential part of many tasks such as navigation and scene understanding. While various point cloud completion models have demonstrated powerful capabilities, their robustness against adversarial attacks, which have proven severely damaging to deep neural networks, remains unknown. In addition, existing attack approaches for point cloud classifiers cannot be applied to completion models due to their different output forms and attack purposes. To evaluate the robustness of completion models, we propose PointCA, the first adversarial attack against 3D point cloud completion models. PointCA generates adversarial point clouds that maintain high similarity with the original ones while being completed as another object with totally different semantic information. Specifically, we minimize the representation discrepancy between the adversarial example and the target point set to jointly explore adversarial point clouds in the geometry space and the feature space. Furthermore, to launch a stealthier attack, we employ neighbourhood density information to tailor the perturbation constraint, leading to geometry-aware and distribution-adaptive modifications for each point. Extensive experiments against several leading point cloud completion networks show that PointCA can cause a performance degradation from 77.9% to 16.7%, with the structure chamfer distance kept below 0.01. We conclude that existing completion models are severely vulnerable to adversarial examples, and that state-of-the-art defenses for point cloud classification are partially invalid when applied to incomplete and uneven point cloud data.
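The joint feature-space and geometry-space objective can be sketched as follows; the encoder interface, the loss weighting, and the use of a constant per-point budget (rather than the paper's density-adaptive one) are simplifying assumptions.

    import torch
    import torch.nn.functional as F

    def chamfer_distance(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        """Symmetric Chamfer distance between point clouds a (N, 3) and b (M, 3)."""
        d = torch.cdist(a, b)                                    # (N, M) pairwise distances
        return d.min(dim=1).values.mean() + d.min(dim=0).values.mean()

    def adversarial_step(x_adv, x_orig, target_feat, encoder, lr=0.01, lam=10.0, eps=0.05):
        """One gradient step in the spirit of PointCA (illustrative only): pull the
        encoder feature of the adversarial cloud towards a target object's feature,
        while a Chamfer term and a per-point budget `eps` keep it close to the
        original. PointCA additionally scales the budget by local neighbourhood
        density; a constant budget is used here for brevity."""
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.mse_loss(encoder(x_adv), target_feat) + lam * chamfer_distance(x_adv, x_orig)
        loss.backward()
        with torch.no_grad():
            x_adv = x_adv - lr * x_adv.grad.sign()
            x_adv = x_orig + (x_adv - x_orig).clamp(-eps, eps)   # project back into the budget
        return x_adv.detach()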
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
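As a usage note, the released checkpoints can be loaded through the Hugging Face transformers library; the snippet below uses the smaller bigscience/bloom-560m variant of the BLOOM family so it runs on modest hardware, and is our own illustration rather than part of the paper.

    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-560m")
    model = AutoModelForCausalLM.from_pretrained("bigscience/bloom-560m")

    # prompt the model and decode a short continuation
    inputs = tokenizer("BLOOM is a 176B-parameter open-access language model that",
                       return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=30)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))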
Single-cell technologies are revolutionizing the entire field of biology. The large volumes of data generated by single-cell technologies are high-dimensional, sparse, and heterogeneous, and have complicated dependency structures, making analyses using conventional machine learning approaches challenging and impractical. In tackling these challenges, deep learning often demonstrates superior performance compared to traditional machine learning methods. In this work, we present a comprehensive survey of deep learning in single-cell analysis. We first introduce background on single-cell technologies and their development, as well as fundamental concepts of deep learning, including the most popular deep architectures. We present an overview of the single-cell analysis pipeline pursued in research applications, noting divergences due to data sources or specific applications. We then review seven popular tasks spanning different stages of the single-cell analysis pipeline: multimodal integration, imputation, clustering, spatial domain identification, cell-type deconvolution, cell segmentation, and cell-type annotation. For each task, we describe the most recent developments in classical and deep learning methods, discuss their advantages and disadvantages, and summarize deep learning tools and benchmark datasets. Finally, we discuss future directions and the most recent challenges. This survey will serve as a reference for biologists and computer scientists and encourage collaborations.
Volumetric measurement of fetal structures in MRI is time-consuming and prone to error, and therefore requires automatic segmentation. Placenta segmentation, and accurate fetal brain segmentation for gyrification assessment, are particularly challenging because of the placenta's fuzzy boundaries and the complex foldings of the fetal brain cortex. In this paper, we study the use of the contour Dice loss for these problems and compare it to other boundary losses and to the combined Dice and cross-entropy loss. The loss is computed efficiently for each slice via erosion, dilation, and XOR operators. We describe a new formulation of the loss that is similar to the contour Dice metric. The combination of the Dice loss and the contour Dice provided the best performance for placenta segmentation. For fetal brain segmentation, the best-performing loss was the combined Dice and cross-entropy loss, followed by the Dice with contour Dice loss, which performed better than the other boundary losses.
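A possible per-slice implementation of a contour Dice loss via morphological operators is sketched below; the band width, the soft max-pool morphology, and the exact Dice formulation over contour bands are our assumptions and may differ from the paper's formulation.

    import torch
    import torch.nn.functional as F

    def _dilate(mask, k=3):
        """Soft binary dilation via max-pooling (mask: B x 1 x H x W, values in [0, 1])."""
        return F.max_pool2d(mask, kernel_size=k, stride=1, padding=k // 2)

    def _erode(mask, k=3):
        """Soft binary erosion = complement of the dilation of the complement."""
        return 1.0 - _dilate(1.0 - mask, k)

    def contour_dice_loss(pred, target, k=3, eps=1e-6):
        """Illustrative contour Dice loss: the contour band of each mask is taken
        as dilation XOR erosion, and the Dice overlap between the two bands is
        maximised. `pred` holds soft foreground probabilities and `target` binary
        labels, both of shape B x 1 x H x W."""
        band_p = torch.clamp(_dilate(pred, k) - _erode(pred, k), 0, 1)     # soft XOR band
        band_t = torch.clamp(_dilate(target, k) - _erode(target, k), 0, 1)
        inter = (band_p * band_t).sum(dim=(1, 2, 3))
        dice = (2 * inter + eps) / (band_p.sum(dim=(1, 2, 3)) + band_t.sum(dim=(1, 2, 3)) + eps)
        return 1.0 - dice.mean()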
Deep learning methods have been shown to segment structures and pathologies in medical imaging effectively. However, they require large annotated datasets, whose manual segmentation is a tedious and time-consuming task, especially for large structures. We present a new method of partial annotations that uses a small set of consecutive annotated slices from each scan, with an annotation effort equal to that of only a few fully annotated cases. Training with partial annotations is performed using only the annotated blocks, incorporating information from slices outside the structure of interest, and modifying the batch loss function to consider only the annotated slices. To facilitate training in a low-data regime, we use a two-step optimization process. We tested the method with the popular soft Dice loss on two MRI sequences, TRUFI and FIESTA, and compared full annotations to partial annotations with a similar annotation effort. For the TRUFI data, partial annotations performed slightly better on average than full annotations, with the Dice score increasing from 0.936 to 0.942, the standard deviation (STD) of the Dice decreasing by 22%, and the average symmetric surface distance (ASSD) improving by 15%. For the FIESTA sequence, partial annotations also reduced the STD of the Dice score and the ASSD metric on out-of-distribution data, by 27.5% and 33% respectively, with the Dice score increasing from 0.84 to 0.9 and the ASSD decreasing from 7.46 to 4.01 mm. The two-step optimization process helped with partial annotations for both in-distribution and out-of-distribution data. The partial-annotation method with the two-step optimizer is therefore recommended to improve segmentation performance under a low-data regime.
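The slice-masked batch loss can be sketched as follows; the soft Dice form and the handling of batches without annotated slices are assumptions made for illustration and may differ from the paper's implementation.

    import torch

    def soft_dice(pred, target, eps=1e-6):
        """Soft Dice per slice; pred and target are (S, H, W) with S slices."""
        inter = (pred * target).sum(dim=(1, 2))
        denom = pred.sum(dim=(1, 2)) + target.sum(dim=(1, 2))
        return (2 * inter + eps) / (denom + eps)

    def partial_annotation_loss(pred, target, annotated):
        """Partial-annotation batch loss: the soft Dice is computed per slice, but
        only slices flagged as annotated contribute, so unannotated slices in the
        block neither help nor hurt training. `annotated` is a boolean mask of
        shape (S,)."""
        dice = soft_dice(pred, target)          # (S,)
        dice = dice[annotated]                  # keep annotated slices only
        if dice.numel() == 0:                   # no annotated slice in this batch
            return pred.sum() * 0.0             # zero loss that keeps the graph intact
        return 1.0 - dice.mean()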